Sliding Reservoir Approach for Delayed Labeling in Streaming Data Classification

Authors

  • Hanqing Hu
  • Mehmed M. Kantardzic
Abstract

When concept drift occurs in streaming data, a streaming data classification framework needs to update its learning model to maintain performance. In real-world applications, the labeled samples required to train a new model are often not available immediately. This labeling delay can degrade the performance of traditional streaming data classification frameworks. To solve this problem, we propose the Sliding Reservoir Approach for Delayed Labeling (SRADL). By combining chunk-based semi-supervised learning with a novel approach to managing labeled data, SRADL does not need to wait for the labeling process to finish before updating the learning model. Experiments with two delayed-label scenarios show that SRADL improves prediction performance over the naïve approach by as much as 7.5% in certain cases. The largest gain occurs with an 18-chunk labeling delay under the continuous label delivery scenario in the real-world data experiments.
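The delayed-label setting described in the abstract can be made concrete with a small simulation. This is a minimal sketch and not the authors' SRADL algorithm: it only assumes that labels for a chunk arrive a fixed number of chunks later and that a fixed-capacity buffer keeps the most recently labeled chunks for retraining. The names `simulate`, `delay`, and `capacity` are hypothetical and do not come from the paper.

```python
from collections import deque


def simulate(num_chunks, delay, capacity):
    """Return, for each time step, the ids of labeled chunks held in the buffer.

    Assumption (not from the paper): labels for chunk i arrive at step i + delay,
    and the buffer keeps at most `capacity` labeled chunks, dropping the oldest.
    """
    reservoir = deque(maxlen=capacity)  # oldest entry is evicted when full
    history = []
    for t in range(num_chunks):
        labeled = t - delay  # id of the chunk whose labels arrive now
        if labeled >= 0:
            reservoir.append(labeled)
        history.append(list(reservoir))  # snapshot usable for a model update
    return history


history = simulate(num_chunks=6, delay=2, capacity=3)
for step, chunks in enumerate(history):
    print(step, chunks)
```

With a delay of 2 chunks, no labeled data is available during the first two steps, which is the gap that a delayed-labeling method has to bridge. Once the buffer is full, it slides forward one chunk at a time.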


Similar resources

Classification and Identification of Geological Facies Using Seismic Data and Competitive Neural Networks

Geological facies interpretation is essential for reservoir studies. Classifying and identifying seismic traces is a powerful approach to geological facies classification and discrimination. Neural networks are increasingly used as classifiers in many sciences, including seismology. They are computationally efficient and well suited to pattern identification. They can simply learn new algori...


Fuzzy Data Envelopment Analysis for Classification of Streaming Data

The classification of fuzzy uncertain data is considered one of the most challenging issues in data analysis. In spite of the significance of fuzzy data in mathematical programming, the development of the analytical methods of fuzzy data is slow. Therefore, the current study proposes a new fuzzy data classification method based on fuzzy data envelopment analysis (DEA) which can handle strea...


Classification of Streaming Fuzzy DEA Using Self-Organizing Map

The classification of fuzzy data is considered one of the most challenging areas of data analysis, and the complexity of the procedures has been an obstacle to the development of new methods for fuzzy data analysis. However, there have been significant advances in mathematical programming for modeling systems in which fuzzy data are available. In order to exploit the results of the research on ...


Online Streaming Feature Selection Using Geometric Series of the Adjacency Matrix of Features

Feature Selection (FS) is an important pre-processing step in machine learning and data mining. All the traditional feature selection methods assume that the entire feature space is available from the beginning. However, online streaming features (OSF) are an integral part of many real-world applications. In OSF, the number of training examples is fixed while the number of features grows with t...




Publication date: 2017